Language Model Data Augmentation for Keyword Spotting in Low-Resourced Training Conditions

Authors

Arseniy Gorin

Rasa Lileikyte

Guangpu Huang

Lori Lamel

Jean-Luc Gauvain

Antoine Laurent

Abstract

This research extends our earlier work on using machine translation (MT) and word-based recurrent neural networks to augment language model training data for keyword search in conversational Cantonese speech. MT-based data augmentation is applied to two language pairs: English-Lithuanian and English-Amharic. Using filtered N-best MT hypotheses for language modeling is found to perform better than using only the 1-best translation. Target language texts collected from the Web and filtered to select conversational-like data are used in several ways. In addition to using Web data to train the language model of the speech recognizer, we investigate using this data to improve the language model and phrase table of the MT system in order to obtain better translations of the English data. Finally, generating text data with a character-based recurrent neural network is investigated. This approach allows new word forms to be produced, providing a way to reduce the out-of-vocabulary rate and thereby improve keyword spotting performance. We study how these different methods of language model data augmentation affect speech-to-text and keyword spotting performance for the Lithuanian and Amharic languages. The best results are obtained by combining all of the explored methods.
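
The abstract ties augmentation to a lower out-of-vocabulary (OOV) rate. As a rough illustration only, the following minimal Python sketch computes the OOV rate as the fraction of running test tokens that are missing from a vocabulary, before and after adding new word forms. The function name `oov_rate` and the toy word lists are assumptions made for this example; they are not taken from the paper, and the paper's exact OOV definition may differ.

```python
def oov_rate(vocabulary, test_tokens):
    """Return the fraction of running test tokens absent from the vocabulary."""
    if not test_tokens:
        return 0.0
    vocab = set(vocabulary)
    missing = sum(1 for token in test_tokens if token not in vocab)
    return missing / len(test_tokens)


# Toy data, purely hypothetical (not from the paper).
base_vocab = ["labas", "rytas", "ačiū"]
# Word forms assumed to come from augmented text (e.g. MT output or generated text).
augmented_vocab = base_vocab + ["vakaras", "prašau"]
test_tokens = ["labas", "vakaras", "ačiū", "prašau", "sveiki"]

print(f"OOV rate, base vocabulary:      {oov_rate(base_vocab, test_tokens):.2f}")
print(f"OOV rate, augmented vocabulary: {oov_rate(augmented_vocab, test_tokens):.2f}")
```

In this toy case, the keywords that only appear in the augmented vocabulary become searchable as in-vocabulary words, which is the effect the abstract attributes to generating new word forms.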

Similar resources

Language independent and unsupervised acoustic models for speech recognition and keyword spotting

Developing high-performance speech processing systems for low-resource languages is very challenging. One approach to address the lack of resources is to make use of data from multiple languages. A popular direction in recent years is to train a multi-language bottleneck DNN. Language dependent and/or multi-language (all training languages) Tandem acoustic models are then trained. This work con...

Data augmentation for low resource languages

Recently there has been interest in the approaches for training speech recognition systems for languages with limited resources. Under the IARPA Babel program such resources have been provided for a range of languages to support this research area. This paper examines a particular form of approach, data augmentation, that can be applied to these situations. Data augmentation schemes aim to incr...

Spanish Keyword Spotting System Based on Filler Models, Pseudo N-gram Language Model and a Confidence Measure

To efficiently organize many hours of audio content, such as meetings and radio news, searching for spoken keywords is essential. One approach uses filler models to account for non-keyword intervals. Another approach uses a large vocabulary continuous speech recognition (LVCSR) system, which retrieves a word string and then searches for the keywords in this string. This approach yields high p...

Cross-language phoneme mapping for phonetic search keyword spotting using multiple source languages

Performing Phonetic Search Keyword Spotting (PS KWS) in new languages when language resources are scarce is an interesting and challenging task. In a previous paper we reported a methodology that enabled PS KWS under these conditions utilizing cross-language phoneme mappings from another sufficiently resourced and well-trained source language. We performed phoneme recognition in the new target ...

Enhancing low resource keyword spotting with automatically retrieved web documents

Keyword Spotting (KWS) systems developed for low resource languages with very little transcribed audio suffer due to a small vocabulary (high out-of-vocabulary (OOV) rate) and a weak language model. In this paper, we propose to augment such systems using automatically retrieved web documents. Our procedure can find large volumes of web documents similar to a small pool of training transcription...

Publication date: 2016
